Abstract

An error occurred in the computation of a gradient in [1]. Equations (20)
in the Appendix and (17) in the text were not correct. The current paper
presents the correct version of these equations.

I Summary of [1]

In [1] (see Appendix for an authors' version of this article), we proposed a
maximum likelihood approach for blindly separating a linear-quadratic mixture
defined by (Eq. (2) in [1]):

$$x_1 = s_1 - l_1 s_2 - q_1 s_1 s_2, \qquad x_2 = s_2 - l_2 s_1 - q_2 s_1 s_2 \tag{I.1}$$

where $s_1$ and $s_2$ are two independent sources. The log-likelihood for $N$
samples of the mixed signals $x_1$ and $x_2$ reads (Eq. (12) in [1]):

$$L = E_t[\log f_{S_1}(s_1(t))] + E_t[\log f_{S_2}(s_2(t))] - E_t[\log |J(s_1(t), s_2(t))|] \tag{I.2}$$

where $E_t[.]$ represents the time average operator on the $N$ samples,
$f_{S_1}(.)$ and $f_{S_2}(.)$ are the probability density functions (pdf) of
the sources $s_1$ and $s_2$, and $J$ is the Jacobian of the mixture which reads
(Eq. (4) in [1])

$$J = 1 - l_1 l_2 - (q_2 + l_2 q_1) s_1 - (q_1 + l_1 q_2) s_2. \tag{I.3}$$

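Maximizing the log-likelihood requires that its gradient with respect to the
parameter vector $\mathbf{w} = [l_1, l_2, q_1, q_2]$, i.e.
$\partial L / \partial \mathbf{w}$, vanishes. Defining the score functions of
the two sources as (Eq. (13) in [1])

$$\psi_i(u) = -\frac{\partial \log f_{S_i}(u)}{\partial u}, \qquad i = 1, 2$$

we can write (Eq. (14) in [1])

$$\frac{\partial L}{\partial \mathbf{w}} = -E_t\left[\psi_1(s_1)\frac{\partial s_1}{\partial \mathbf{w}} + \psi_2(s_2)\frac{\partial s_2}{\partial \mathbf{w}} + \frac{1}{J}\frac{\partial J}{\partial \mathbf{w}}\right]. \tag{I.4}$$
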
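Rewriting (I.1) in the vector form $\mathbf{x} = \mathbf{f}(\mathbf{s}, \mathbf{w})$
and considering $\mathbf{w}$ as the independent variable and $\mathbf{s}$ as the
dependent variable, we can write, using implicit differentiation (Eq. (15) in [1])

$$\mathbf{0} = \frac{\partial \mathbf{f}}{\partial \mathbf{s}}\frac{\partial \mathbf{s}}{\partial \mathbf{w}} + \frac{\partial \mathbf{f}}{\partial \mathbf{w}} \tag{I.5}$$

which yields (Eq. (16) in [1])

$$\frac{\partial \mathbf{s}}{\partial \mathbf{w}} = -\left(\frac{\partial \mathbf{f}}{\partial \mathbf{s}}\right)^{-1}\frac{\partial \mathbf{f}}{\partial \mathbf{w}}. \tag{I.6}$$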



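Note that $\partial \mathbf{f} / \partial \mathbf{s}$ is the Jacobian matrix of
the mixing model. Considering (I.1), we can write (Appendix in [1])

$$\frac{\partial \mathbf{f}}{\partial \mathbf{s}} = \begin{pmatrix} 1 - q_1 s_2 & -l_1 - q_1 s_1 \\ -l_2 - q_2 s_2 & 1 - q_2 s_1 \end{pmatrix} \quad \text{and} \quad \frac{\partial \mathbf{f}}{\partial \mathbf{w}} = \begin{pmatrix} -s_2 & 0 & -s_1 s_2 & 0 \\ 0 & -s_1 & 0 & -s_1 s_2 \end{pmatrix}$$

which implies, from (I.6),

$$\frac{\partial \mathbf{s}}{\partial \mathbf{w}} = \frac{-1}{J}\begin{pmatrix} 1 - q_2 s_1 & l_1 + q_1 s_1 \\ l_2 + q_2 s_2 & 1 - q_1 s_2 \end{pmatrix}\begin{pmatrix} -s_2 & 0 & -s_1 s_2 & 0 \\ 0 & -s_1 & 0 & -s_1 s_2 \end{pmatrix} \tag{I.7}$$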



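and yields (Eq. (19) in [1])

$$\begin{aligned}
\frac{\partial s_1}{\partial \mathbf{w}} &= \frac{1}{J}\left[(1 - q_2 s_1)s_2,\ (l_1 + q_1 s_1)s_1,\ (1 - q_2 s_1)s_1 s_2,\ (l_1 + q_1 s_1)s_1 s_2\right] \\
\frac{\partial s_2}{\partial \mathbf{w}} &= \frac{1}{J}\left[(l_2 + q_2 s_2)s_2,\ (1 - q_1 s_2)s_1,\ (l_2 + q_2 s_2)s_1 s_2,\ (1 - q_1 s_2)s_1 s_2\right]
\end{aligned} \tag{I.8}$$

Using (I.8), we obtain the first two terms of the gradient (I.4). To obtain the
third term, we need to compute $\partial J / \partial \mathbf{w}$. This partial
derivative was computed inaccurately in [1], so that Equations (20), and thus
(17), in [1] are erroneous.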
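
The relations (I.6) to (I.8) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names (`jacobian_det`, `ds_dw_implicit`, `ds_dw_closed_form`) and the test point are illustrative choices and do not come from [1].

```python
import numpy as np

def jacobian_det(s, w):
    """Jacobian J of the mixture, Eq. (I.3)."""
    s1, s2 = s
    l1, l2, q1, q2 = w
    return 1 - l1 * l2 - (q2 + l2 * q1) * s1 - (q1 + l1 * q2) * s2

def ds_dw_implicit(s, w):
    """Eq. (I.6): ds/dw = -(df/ds)^{-1} df/dw, returned as a 2x4 array."""
    s1, s2 = s
    l1, l2, q1, q2 = w
    df_ds = np.array([[1 - q1 * s2, -l1 - q1 * s1],
                      [-l2 - q2 * s2, 1 - q2 * s1]])
    df_dw = np.array([[-s2, 0.0, -s1 * s2, 0.0],
                      [0.0, -s1, 0.0, -s1 * s2]])
    return -np.linalg.solve(df_ds, df_dw)

def ds_dw_closed_form(s, w):
    """Eq. (I.8): the two rows ds1/dw and ds2/dw written out explicitly."""
    s1, s2 = s
    l1, l2, q1, q2 = w
    J = jacobian_det(s, w)
    row1 = [(1 - q2 * s1) * s2, (l1 + q1 * s1) * s1,
            (1 - q2 * s1) * s1 * s2, (l1 + q1 * s1) * s1 * s2]
    row2 = [(l2 + q2 * s2) * s2, (1 - q1 * s2) * s1,
            (l2 + q2 * s2) * s1 * s2, (1 - q1 * s2) * s1 * s2]
    return np.array([row1, row2]) / J

if __name__ == "__main__":
    s = (0.3, -0.4)                 # arbitrary source values (assumption)
    w = (-0.2, 0.2, -0.8, 0.8)      # coefficients used for Fig. 1 in the Appendix
    print(np.allclose(ds_dw_implicit(s, w), ds_dw_closed_form(s, w)))
```

Solving the linear system with `np.linalg.solve` rather than forming the inverse explicitly is only a numerical convenience; both routes express the same relation (I.6).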



II Correct versions of Equations (20) and (17) in [1]

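In [1] we did not consider the implicit dependence of $s_1$ and $s_2$ on
$\mathbf{w}$ and computed the derivative of $J$ with respect to $\mathbf{w}$
ignoring this dependence. Considering $J = g(\mathbf{w}, \mathbf{s}(\mathbf{w}))$,
the correct equation for $\partial J / \partial \mathbf{w}$ reads

$$\frac{\partial J}{\partial \mathbf{w}} = \left.\frac{\partial J}{\partial \mathbf{w}}\right|_{\mathbf{s}} + \frac{\partial J}{\partial \mathbf{s}}\frac{\partial \mathbf{s}}{\partial \mathbf{w}} \tag{II.1}$$

Equation (20) in [1] only corresponded to the first term on the right side of the
above relation which reads, following (I.3), as:

$$\left.\frac{\partial J}{\partial \mathbf{w}}\right|_{\mathbf{s}} = -\left[l_2 + q_2 s_2,\ l_1 + q_1 s_1,\ l_2 s_1 + s_2,\ s_1 + l_1 s_2\right] \tag{II.2}$$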
We now compute the gradient (II.1) entirely. Considering (I.3), we can write

$$\frac{\partial J}{\partial \mathbf{s}} = -\left[q_2 + l_2 q_1,\ q_1 + l_1 q_2\right] \tag{II.3}$$

Using (II.1), (II.2), (II.3) and (I.7) we finally obtain the following equation,
which must replace Equation (20) in [1]

$$\begin{aligned}
\frac{\partial J}{\partial \mathbf{w}} = \Big[ & -(l_2 + q_2 s_2) - (q_2 + l_2 q_1)(1 - q_2 s_1)s_2/J - (q_1 + l_1 q_2)(l_2 + q_2 s_2)s_2/J, \\
& -(l_1 + q_1 s_1) - (q_1 + l_1 q_2)(1 - q_1 s_2)s_1/J - (q_2 + l_2 q_1)(l_1 + q_1 s_1)s_1/J, \\
& -(l_2 s_1 + s_2) - (q_2 + l_2 q_1)(1 - q_2 s_1)s_1 s_2/J - (q_1 + l_1 q_2)(l_2 + q_2 s_2)s_1 s_2/J, \\
& -(l_1 s_2 + s_1) - (q_1 + l_1 q_2)(1 - q_1 s_2)s_1 s_2/J - (q_2 + l_2 q_1)(l_1 + q_1 s_1)s_1 s_2/J \Big]
\end{aligned} \tag{II.4}$$
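
The role of the implicit term in (II.1) can be illustrated numerically: for fixed observations, perturbing $\mathbf{w}$ also moves the sources, so a finite-difference derivative of $J$ should agree with (II.4) and not with (II.2) alone. The sketch below assumes NumPy; the Newton-based `unmix` helper, the step size and the test point are illustrative assumptions, not part of [1].

```python
import numpy as np

def mix(s, w):
    """Mixing model, Eq. (I.1)."""
    s1, s2 = s
    l1, l2, q1, q2 = w
    return np.array([s1 - l1 * s2 - q1 * s1 * s2,
                     s2 - l2 * s1 - q2 * s1 * s2])

def jacobian_det(s, w):
    """Jacobian J of the mixture, Eq. (I.3)."""
    s1, s2 = s
    l1, l2, q1, q2 = w
    return 1 - l1 * l2 - (q2 + l2 * q1) * s1 - (q1 + l1 * q2) * s2

def unmix(x, w, s_init, n_iter=30):
    """Invert the mixture for fixed x by Newton's method (illustrative helper)."""
    l1, l2, q1, q2 = w
    s = np.array(s_init, dtype=float)
    for _ in range(n_iter):
        df_ds = np.array([[1 - q1 * s[1], -l1 - q1 * s[0]],
                          [-l2 - q2 * s[1], 1 - q2 * s[0]]])
        s = s - np.linalg.solve(df_ds, mix(s, w) - x)
    return s

def dJ_dw_partial(s, w):
    """Eq. (II.2): derivative of J with s held fixed."""
    s1, s2 = s
    l1, l2, q1, q2 = w
    return -np.array([l2 + q2 * s2, l1 + q1 * s1, l2 * s1 + s2, s1 + l1 * s2])

def dJ_dw_corrected(s, w):
    """Eq. (II.4), assembled as (II.2) + (II.3) times (I.8)."""
    s1, s2 = s
    l1, l2, q1, q2 = w
    J = jacobian_det(s, w)
    ds1 = np.array([(1 - q2 * s1) * s2, (l1 + q1 * s1) * s1,
                    (1 - q2 * s1) * s1 * s2, (l1 + q1 * s1) * s1 * s2]) / J
    ds2 = np.array([(l2 + q2 * s2) * s2, (1 - q1 * s2) * s1,
                    (l2 + q2 * s2) * s1 * s2, (1 - q1 * s2) * s1 * s2]) / J
    return dJ_dw_partial(s, w) - (q2 + l2 * q1) * ds1 - (q1 + l1 * q2) * ds2

def dJ_dw_numeric(x, w, s_init, eps=1e-6):
    """Central finite differences of J(w, s(w)) with the observations x fixed."""
    w = np.asarray(w, dtype=float)
    grad = np.zeros(4)
    for k in range(4):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus[k] += eps
        w_minus[k] -= eps
        J_plus = jacobian_det(unmix(x, w_plus, s_init), w_plus)
        J_minus = jacobian_det(unmix(x, w_minus, s_init), w_minus)
        grad[k] = (J_plus - J_minus) / (2 * eps)
    return grad
```

A possible use is to pick a source point, form `x = mix(s, w)`, and compare `dJ_dw_numeric(x, w, s)` with `dJ_dw_corrected(s, w)` and with `dJ_dw_partial(s, w)`.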

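Inserting (I.8) and (II.4) in (I.4), we obtain the following expression for the
gradient, which must replace Equation (17) in [1]

$$\begin{aligned}
\frac{\partial L}{\partial \mathbf{w}} = -E_t\Big[ & \big(\psi_1(s_1)(1 - q_2 s_1)s_2 + \psi_2(s_2)(l_2 + q_2 s_2)s_2 \\
& \quad - (l_2 + q_2 s_2) - (q_2 + l_2 q_1)(1 - q_2 s_1)s_2/J - (q_1 + l_1 q_2)(l_2 + q_2 s_2)s_2/J\big)/J, \\
& \big(\psi_1(s_1)(l_1 + q_1 s_1)s_1 + \psi_2(s_2)(1 - q_1 s_2)s_1 \\
& \quad - (l_1 + q_1 s_1) - (q_1 + l_1 q_2)(1 - q_1 s_2)s_1/J - (q_2 + l_2 q_1)(l_1 + q_1 s_1)s_1/J\big)/J, \\
& \big(\psi_1(s_1)(1 - q_2 s_1)s_1 s_2 + \psi_2(s_2)(l_2 + q_2 s_2)s_1 s_2 \\
& \quad - (l_2 s_1 + s_2) - (q_2 + l_2 q_1)(1 - q_2 s_1)s_1 s_2/J - (q_1 + l_1 q_2)(l_2 + q_2 s_2)s_1 s_2/J\big)/J, \\
& \big(\psi_1(s_1)(l_1 + q_1 s_1)s_1 s_2 + \psi_2(s_2)(1 - q_1 s_2)s_1 s_2 \\
& \quad - (l_1 s_2 + s_1) - (q_1 + l_1 q_2)(1 - q_1 s_2)s_1 s_2/J - (q_2 + l_2 q_1)(l_1 + q_1 s_1)s_1 s_2/J\big)/J \Big]
\end{aligned} \tag{II.5}$$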
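
As a sanity check of (II.5), the corrected gradient can be compared with a finite-difference gradient of the log-likelihood (I.2). The sketch below is only illustrative and rests on several assumptions that are not in [1]: the source density is taken proportional to $1/\cosh(u)$, so that the score function is $\psi(u) = \tanh(u)$; the additive constant of $\log f$ is dropped because it does not affect the gradient; the mixture is inverted by Newton's method started from the true sources; and all function names are hypothetical.

```python
import numpy as np

def mix(s, w):
    """Mixing model, Eq. (I.1)."""
    s1, s2 = s
    l1, l2, q1, q2 = w
    return np.array([s1 - l1 * s2 - q1 * s1 * s2,
                     s2 - l2 * s1 - q2 * s1 * s2])

def jacobian_det(s, w):
    """Jacobian J of the mixture, Eq. (I.3)."""
    s1, s2 = s
    l1, l2, q1, q2 = w
    return 1 - l1 * l2 - (q2 + l2 * q1) * s1 - (q1 + l1 * q2) * s2

def unmix(x, w, s_init, n_iter=30):
    """Invert the mixture for fixed x by Newton's method (illustrative helper)."""
    l1, l2, q1, q2 = w
    s = np.array(s_init, dtype=float)
    for _ in range(n_iter):
        df_ds = np.array([[1 - q1 * s[1], -l1 - q1 * s[0]],
                          [-l2 - q2 * s[1], 1 - q2 * s[0]]])
        s = s - np.linalg.solve(df_ds, mix(s, w) - x)
    return s

def score(u):
    """Assumed score function: psi(u) = -d/du log(1/cosh(u)) = tanh(u)."""
    return np.tanh(u)

def log_likelihood(X, w, S_init):
    """Eq. (I.2) up to an additive constant, for the assumed 1/cosh density."""
    total = 0.0
    for x, s0 in zip(X, S_init):
        s = unmix(x, w, s0)
        total += (-np.log(np.cosh(s[0])) - np.log(np.cosh(s[1]))
                  - np.log(abs(jacobian_det(s, w))))
    return total / len(X)

def corrected_gradient(S, w):
    """Eq. (II.5): time average over the samples in S (one row per sample)."""
    l1, l2, q1, q2 = w
    grad = np.zeros(4)
    for s1, s2 in S:
        J = jacobian_det((s1, s2), w)
        ds1 = np.array([(1 - q2 * s1) * s2, (l1 + q1 * s1) * s1,
                        (1 - q2 * s1) * s1 * s2, (l1 + q1 * s1) * s1 * s2]) / J
        ds2 = np.array([(l2 + q2 * s2) * s2, (1 - q1 * s2) * s1,
                        (l2 + q2 * s2) * s1 * s2, (1 - q1 * s2) * s1 * s2]) / J
        dJ = (-np.array([l2 + q2 * s2, l1 + q1 * s1, l2 * s1 + s2, s1 + l1 * s2])
              - (q2 + l2 * q1) * ds1 - (q1 + l1 * q2) * ds2)     # Eq. (II.4)
        grad += -(score(s1) * ds1 + score(s2) * ds2 + dJ / J)    # Eq. (I.4)
    return grad / len(S)

def numeric_gradient(X, w, S_init, eps=1e-6):
    """Central finite differences of the log-likelihood with respect to w."""
    w = np.asarray(w, dtype=float)
    grad = np.zeros(4)
    for k in range(4):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus[k] += eps
        w_minus[k] -= eps
        grad[k] = (log_likelihood(X, w_plus, S_init)
                   - log_likelihood(X, w_minus, S_init)) / (2 * eps)
    return grad
```

The per-sample loop keeps the correspondence with (I.4), (I.8) and (II.4) explicit; a vectorised version would be preferable for large $N$.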
APPENDIX: Authors' version of [1]



1 Introduction 

It is well known that the independence hypothesis is not sufficient for
separating general nonlinear mixtures because of the very large indeterminacies
which make the nonlinear BSS problem ill-posed. A natural idea for reducing the
indeterminacies is to constrain the structure of mixing and separating models to
belong to a certain set of transformations. This supplementary constraint can
be viewed as a regularization of the initially ill-posed problem.

In this paper, we study a linear-quadratic mixture model which may be 
considered as the simplest (nonlinear) version of a general polynomial model. 
Our main aim is to develop an approach which can be easily extended to higher- 
order polynomial models. Hence, we propose a recurrent separating structure 
whose realization does not require the knowledge of the explicit form of the 
inverse of the mixing model. We develop a rigorous method to identify the 
parameters of the separating structure in a maximum likelihood framework. 
The algorithm is developed so that the inverse of the mixing structure is not 
required to be known. Thus, it can be extended to more general polynomial 
mixtures. 

2 Mixing and separating models

Suppose $u_1$ and $u_2$ are two independent random signals. Given the following
nonlinear instantaneous mixture model

$$x_i = a_{i1} u_1 + a_{i2} u_2 + b_i u_1 u_2, \qquad i = 1, 2 \tag{1}$$

we would like to estimate $u_1$ and $u_2$ up to a permutation and a scaling factor
(and possibly an additive constant). For simplicity, let us denote
$s_1 = a_{11} u_1$ and $s_2 = a_{22} u_2$. In the following, $s_1$ and $s_2$ will
be referred to as the sources. Equation (1) can be rewritten as

$$\begin{aligned}
x_1 &= s_1 - l_1 s_2 - q_1 s_1 s_2 \\
x_2 &= s_2 - l_2 s_1 - q_2 s_1 s_2
\end{aligned} \tag{2}$$

in which $l_1 = -a_{12}/a_{22}$ and $l_2 = -a_{21}/a_{11}$ represent the linear
contributions of the sources in the mixture, and $q_1 = -b_1/(a_{11} a_{22})$ and
$q_2 = -b_2/(a_{11} a_{22})$ represent the quadratic contributions. The negative
signs are chosen to simplify the notation of the separating structure.

Solving the model (2) for $s_1$ and $s_2$ leads to the following two pairs of
solutions, which may be considered as two direct separating structures:

$$\begin{aligned}
(\hat{s}_1, \hat{s}_2)_1 &= \left((-b_1 + \sqrt{\Delta_1})/2a_1,\ (-b_2 + \sqrt{\Delta_2})/2a_2\right) \\
(\hat{s}_1, \hat{s}_2)_2 &= \left((-b_1 - \sqrt{\Delta_1})/2a_1,\ (-b_2 - \sqrt{\Delta_2})/2a_2\right)
\end{aligned} \tag{3}$$
where $\Delta_i = b_i^2 - 4 a_i c_i$, $a_1 = q_2 + l_2 q_1$, $a_2 = q_1 + l_1 q_2$,
$b_1 = q_1 x_2 - q_2 x_1 + l_1 l_2 - 1$, $b_2 = q_2 x_1 - q_1 x_2 + l_1 l_2 - 1$,
$c_1 = x_1 + l_1 x_2$ and $c_2 = x_2 + l_2 x_1$. It can be easily verified that
$\Delta_1 = \Delta_2 = J^2$, where $J$ is the Jacobian of the mixing model (2)
and reads

$$J = 1 - l_1 l_2 - (q_2 + l_2 q_1) s_1 - (q_1 + l_1 q_2) s_2 \tag{4}$$

Figure 1: Case when J > 0 for all the source values. Distribution of (a) sources,
(b) mixtures, (c) output of the first direct separating structure, (d) output of
the second direct separating structure.

According to the variation domain of the two sources, three different cases may
be considered:

1) $J < 0$ for all the values of $s_1$ and $s_2$. In this case (3) becomes:

$$(\hat{s}_1, \hat{s}_2)_1 = (s_1, s_2) \tag{5}$$

$$(\hat{s}_1, \hat{s}_2)_2 = \left(-\frac{q_1 + l_1 q_2}{q_2 + l_2 q_1} s_2 - \frac{l_1 l_2 - 1}{q_2 + l_2 q_1},\ -\frac{q_2 + l_2 q_1}{q_1 + l_1 q_2} s_1 - \frac{l_1 l_2 - 1}{q_1 + l_1 q_2}\right) \tag{6}$$

Thus, the first direct separating structure in (3) leads to the actual sources and
the second direct separating structure leads to another solution, equivalent to
the first one up to a permutation, a scaling factor, and an additive constant.

2) $J > 0$ for all the values of $s_1$ and $s_2$. In this case, the first structure
leads to the permuting solution, defined by (6), and the second structure to the
actual sources $(s_1, s_2)$. An example is shown in Fig. 1 for the numerical values
$l_1 = -0.2$, $l_2 = 0.2$, $q_1 = -0.8$, $q_2 = 0.8$ and $s_i \in [-0.5, 0.5]$.

3) $J > 0$ for some values of the sources and $J < 0$ for the other values. In
this case, each structure leads to the non-permuted sources (5) for some values
of the observations and to the permuted sources (6) for the other values. An
example is shown in Fig. 2 (with the same coefficients as in the second case,
but for $s_i \in [-2, 2]$). The permutation effect is clearly visible in the
figure. One may also remark that the straight line $J = 0$ in the source plane is
mapped to a conic section in the observation plane (shown by asterisks).
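
The behaviour of the two direct separating structures in the second case can be sketched as follows, assuming NumPy. The function names and the source point are illustrative assumptions; the code also assumes $a_1 \neq 0$ and $a_2 \neq 0$.

```python
import numpy as np

def mix(s, w):
    """Mixing model, Eq. (2)."""
    s1, s2 = s
    l1, l2, q1, q2 = w
    return np.array([s1 - l1 * s2 - q1 * s1 * s2,
                     s2 - l2 * s1 - q2 * s1 * s2])

def direct_structures(x, w):
    """Both pairs of solutions of Eq. (3) for one observation x = (x1, x2)."""
    x1, x2 = x
    l1, l2, q1, q2 = w
    a1, a2 = q2 + l2 * q1, q1 + l1 * q2
    b1 = q1 * x2 - q2 * x1 + l1 * l2 - 1
    b2 = q2 * x1 - q1 * x2 + l1 * l2 - 1
    c1, c2 = x1 + l1 * x2, x2 + l2 * x1
    # max(..., 0) guards against a slightly negative discriminant from round-off
    r1 = np.sqrt(max(b1 ** 2 - 4 * a1 * c1, 0.0))
    r2 = np.sqrt(max(b2 ** 2 - 4 * a2 * c2, 0.0))
    first = np.array([(-b1 + r1) / (2 * a1), (-b2 + r2) / (2 * a2)])
    second = np.array([(-b1 - r1) / (2 * a1), (-b2 - r2) / (2 * a2)])
    return first, second

if __name__ == "__main__":
    w = (-0.2, 0.2, -0.8, 0.8)      # coefficients of the Fig. 1 example
    s = np.array([0.3, -0.4])       # a source point where J > 0 (assumption)
    first, second = direct_structures(mix(s, w), w)
    print("first structure :", first)
    print("second structure:", second)
```

With these coefficients $J > 0$ at the chosen point, so the second structure is expected to return the sources and the first one the permuted solution (6).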

Figure 2: Case when J > 0 for some values of the sources and J < 0 for the
other values. Distribution of (a) sources, (b) mixtures, (c) output of the first
direct separating structure, (d) output of the second direct separating structure.

Thus, it is clear that the direct structures may be used for separating the
sources if the Jacobian of the mixing model is always negative or always positive,
i.e. for all the source values. Otherwise, although the sources are separated
sample by sample, each retrieved signal contains samples of the two sources. This
problem arises because the mixing model (2) is not bijective. This theoretically
insoluble problem should not discourage us. In fact, our final objective is to
extend the idea developed in the current study to more general polynomial models
which will be used to approximate the nonlinear mixtures encountered in the
real world. If these real-world nonlinear models are bijective, we can logically
suppose that the coefficients of their polynomial approximations take values
which make them bijective on the variation domains of the sources. Thus, in
the following, we suppose that the sources and the mixture coefficients have
numerical values ensuring that the Jacobian $J$ of the mixing model has a constant
sign.

The natural idea to separate the sources is to form a direct separating
structure using any of the equations in (3), and to identify the parameters
$l_1$, $l_2$, $q_1$ and $q_2$ by optimizing an independence measuring criterion.
Although this approach may be used for our special mixing model (2), as soon as a
more complicated polynomial model is considered, the solutions
$(\hat{s}_1, \hat{s}_2)$ can no longer be determined, so that the generalization
of the method to arbitrary polynomial models seems impossible. To avoid this
limitation, we propose a recurrent structure shown in Fig. 3. Note that, for
$q_1 = q_2 = 0$, this structure is reduced to the basic Herault-Jutten network. It
may be checked easily that, for fixed observations defined by (2), $y_1 = s_1$ and
$y_2 = s_2$ corresponds to a steady state for the structure in Figure 3.

Figure 3: Recurrent separating structure.

The use of this recurrent structure is more promising because it can be easily
generalized to arbitrary polynomial models. However, the main problem with
this structure is its stability. In fact, even if the mixing model coefficients are
exactly known, the computation of the structure outputs requires the realization
of the following recurrent iterative model

$$\begin{aligned}
y_1(n + 1) &= x_1 + l_1 y_2(n) + q_1 y_1(n) y_2(n) \\
y_2(n + 1) &= x_2 + l_2 y_1(n) + q_2 y_1(n) y_2(n)
\end{aligned} \tag{7}$$

where a loop on $n$ is performed for each couple of observations $(x_1, x_2)$ until
convergence is achieved.

It can be shown that this model is locally stable at the separating point
$(y_1, y_2) = (s_1, s_2)$, if and only if the absolute values of the two
eigenvalues of the Jacobian matrix of (7) are smaller than one. In the following,
we suppose that this condition is satisfied.
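
The recurrence (7) and the local stability condition can be sketched as below, assuming NumPy. The zero initial state, the fixed iteration count and the function names are illustrative assumptions; the text above only specifies that the loop runs until convergence.

```python
import numpy as np

def recurrent_separation(x, w, n_iter=200):
    """Iterate Eq. (7) for one couple of observations, starting from y = (0, 0)."""
    x1, x2 = x
    l1, l2, q1, q2 = w
    y1, y2 = 0.0, 0.0                      # initial state (assumption)
    for _ in range(n_iter):
        # both outputs are updated simultaneously from the previous state
        y1, y2 = (x1 + l1 * y2 + q1 * y1 * y2,
                  x2 + l2 * y1 + q2 * y1 * y2)
    return np.array([y1, y2])

def is_locally_stable(y, w):
    """True if both eigenvalues of the Jacobian matrix of (7) at y lie inside the unit circle."""
    y1, y2 = y
    l1, l2, q1, q2 = w
    jac = np.array([[q1 * y2, l1 + q1 * y1],
                    [l2 + q2 * y2, q2 * y1]])
    return bool(np.all(np.abs(np.linalg.eigvals(jac)) < 1))

if __name__ == "__main__":
    w = (-0.2, 0.2, -0.8, 0.8)             # coefficients of the Fig. 1 example
    l1, l2, q1, q2 = w
    s1, s2 = 0.3, -0.4                     # arbitrary source point (assumption)
    x = (s1 - l1 * s2 - q1 * s1 * s2, s2 - l2 * s1 - q2 * s1 * s2)
    print(is_locally_stable((s1, s2), w), recurrent_separation(x, w))
```

In practice a convergence test on successive iterates would replace the fixed number of iterations used here.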



3 Maximum likelihood estimation of the model 
parameters 

Let $f_{S_1,S_2}(s_1, s_2)$ be the joint pdf of the sources, and assume that the
mixing model is bijective so that the Jacobian of the mixing model has a constant
sign on the variation domain of the sources. The joint pdf of the observations can
be written as

$$f_{X_1,X_2}(x_1, x_2) = \frac{f_{S_1,S_2}(s_1, s_2)}{|J(s_1, s_2)|} \tag{8}$$

Taking the logarithm of (8) and considering the independence of the sources,
we can write:

$$\log f_{X_1,X_2}(x_1, x_2) = \log f_{S_1}(s_1) + \log f_{S_2}(s_2) - \log |J(s_1, s_2)| \tag{9}$$

Given $N$ samples of the mixtures $X_1$ and $X_2$, we want to find the maximum
likelihood estimator for the mixture parameters $\mathbf{w} = [l_1, l_2, q_1, q_2]$.
This estimator is obtained by maximizing the joint pdf of all the observations
(supposing that the parameters in $\mathbf{w}$ are constant), which is equal to

$$E = f_{X_1,X_2}(x_1(1), x_2(1), \ldots, x_1(N), x_2(N)) \tag{10}$$

If $s_1(t)$ and $s_2(t)$ are two i.i.d. sequences, $x_1(t)$ and $x_2(t)$ are also
i.i.d. so that $E = \prod_{t=1}^{N} f_{X_1,X_2}(x_1(t), x_2(t))$ and
$\log E = \sum_{t=1}^{N} \log f_{X_1,X_2}(x_1(t), x_2(t))$. The cost function to be
maximized can be defined as $L = \frac{1}{N} \log E$, which will be denoted using
the temporal averaging operator $E_t[.]$ as

$$L = E_t[\log f_{X_1,X_2}(x_1(t), x_2(t))] \tag{11}$$

Using (9), we obtain

$$L = E_t[\log f_{S_1}(s_1(t))] + E_t[\log f_{S_2}(s_2(t))] - E_t[\log |J(s_1(t), s_2(t))|] \tag{12}$$

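Maximizing this cost function requires that its gradient with respect to the
parameter vector $\mathbf{w}$, i.e. $\partial L / \partial \mathbf{w}$, vanishes.
Defining the score functions of the two sources as

$$\psi_i(u) = -\frac{\partial \log f_{S_i}(u)}{\partial u}, \qquad i = 1, 2 \tag{13}$$

and considering that
$\frac{\partial \log |J|}{\partial \mathbf{w}} = \frac{1}{J}\frac{\partial J}{\partial \mathbf{w}}$,
we can write

$$\frac{\partial L}{\partial \mathbf{w}} = -E_t\left[\psi_1(s_1)\frac{\partial s_1}{\partial \mathbf{w}} + \psi_2(s_2)\frac{\partial s_2}{\partial \mathbf{w}} + \frac{1}{J}\frac{\partial J}{\partial \mathbf{w}}\right] \tag{14}$$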

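Rewriting (2) in the vector form $\mathbf{x} = \mathbf{f}(\mathbf{s}, \mathbf{w})$
and considering $\mathbf{w}$ as the independent variable and $\mathbf{s}$ as the
dependent variable, we can write, using implicit differentiation

$$\mathbf{0} = \frac{\partial \mathbf{f}}{\partial \mathbf{s}}\frac{\partial \mathbf{s}}{\partial \mathbf{w}} + \frac{\partial \mathbf{f}}{\partial \mathbf{w}} \tag{15}$$

which yields

$$\frac{\partial \mathbf{s}}{\partial \mathbf{w}} = -\left(\frac{\partial \mathbf{f}}{\partial \mathbf{s}}\right)^{-1}\frac{\partial \mathbf{f}}{\partial \mathbf{w}} \tag{16}$$

Note that $\partial \mathbf{f} / \partial \mathbf{s}$ is the Jacobian matrix of
the mixing model. Using (14) and (16), the gradient of the cost function $L$ with
respect to the parameter vector $\mathbf{w}$ is equal to (see the appendix for the
computation details)

$$\begin{aligned}
\frac{\partial L}{\partial \mathbf{w}} = -E_t\Big[ & \big(\psi_1(s_1)(1 - q_2 s_1)s_2 + \psi_2(s_2)(l_2 + q_2 s_2)s_2 - (l_2 + q_2 s_2)\big)/J, \\
& \big(\psi_1(s_1)(l_1 + q_1 s_1)s_1 + \psi_2(s_2)(1 - q_1 s_2)s_1 - (l_1 + q_1 s_1)\big)/J, \\
& \big(\psi_1(s_1)(1 - q_2 s_1)s_1 s_2 + \psi_2(s_2)(l_2 + q_2 s_2)s_1 s_2 - (l_2 s_1 + s_2)\big)/J, \\
& \big(\psi_1(s_1)(l_1 + q_1 s_1)s_1 s_2 + \psi_2(s_2)(1 - q_1 s_2)s_1 s_2 - (s_1 + l_1 s_2)\big)/J \Big]
\end{aligned} \tag{17}$$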

In practice, the actual sources and their density functions are unknown and will
be replaced by the reconstructed sources, i.e. by the outputs $y_i$ of the
separating structure of Fig. 3, in an iterative algorithm. The score functions of
the reconstructed sources can be estimated by any of the existing parametric or
non-parametric methods. In our work, we used a kernel estimator based on
third-order cardinal splines. Using (17), the cost function (12) can be maximized
by a gradient ascent algorithm which updates the parameters by the rule
$\mathbf{w}(n + 1) = \mathbf{w}(n) + \mu \, \partial L / \partial \mathbf{w}$. The
learning rate parameter $\mu$ must be chosen carefully to avoid the divergence of
the algorithm. Note that the algorithm does not require the knowledge of the
explicit inverse of the mixing model (direct separating structures (3)). Hence, it
can be easily extended to more general polynomial mixing models.
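Appendix: details of gradient computation

Considering (2), we can write

$$\frac{\partial \mathbf{f}}{\partial \mathbf{s}} = \begin{pmatrix} 1 - q_1 s_2 & -l_1 - q_1 s_1 \\ -l_2 - q_2 s_2 & 1 - q_2 s_1 \end{pmatrix} \quad \text{and} \quad \frac{\partial \mathbf{f}}{\partial \mathbf{w}} = \begin{pmatrix} -s_2 & 0 & -s_1 s_2 & 0 \\ 0 & -s_1 & 0 & -s_1 s_2 \end{pmatrix}$$

which implies, from (16),

$$\frac{\partial \mathbf{s}}{\partial \mathbf{w}} = \frac{-1}{J}\begin{pmatrix} 1 - q_2 s_1 & l_1 + q_1 s_1 \\ l_2 + q_2 s_2 & 1 - q_1 s_2 \end{pmatrix}\begin{pmatrix} -s_2 & 0 & -s_1 s_2 & 0 \\ 0 & -s_1 & 0 & -s_1 s_2 \end{pmatrix}$$

which yields

$$\begin{aligned}
\frac{\partial s_1}{\partial \mathbf{w}} &= \frac{1}{J}\left[(1 - q_2 s_1)s_2,\ (l_1 + q_1 s_1)s_1,\ (1 - q_2 s_1)s_1 s_2,\ (l_1 + q_1 s_1)s_1 s_2\right] \\
\frac{\partial s_2}{\partial \mathbf{w}} &= \frac{1}{J}\left[(l_2 + q_2 s_2)s_2,\ (1 - q_1 s_2)s_1,\ (l_2 + q_2 s_2)s_1 s_2,\ (1 - q_1 s_2)s_1 s_2\right]
\end{aligned} \tag{18}$$

Considering (4),

$$\frac{\partial J}{\partial \mathbf{w}} = -\left[l_2 + q_2 s_2,\ l_1 + q_1 s_1,\ l_2 s_1 + s_2,\ s_1 + l_1 s_2\right] \tag{19}$$

(17) follows directly from (14), (18) and (19).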